166 PART 4 Comparing Groups
that follows the chi-square distribution (also covered in Chapter 24). So the test
statistic from this test should follow the chi-square distribution. Now it is obvious
why it is named the chi-square test! The next step is to obtain the p value for the
test statistic. To do that manually, you would look up the test statistic (which is
8.81 in our case) in a chi-square table.
In actuality, the chi-square distribution refers to a family of distributions. Which
chi-square distribution you are using depends upon a number called the degrees of
freedom, abbreviated d.f. or df or by the Greek lowercase letter nu (ν) (in this book
we use df). The df reflects the independence of the quantities that go into the test
statistic: it counts how many of them are free to vary once the row and column
totals are fixed.
How would you calculate the df for a chi-square test? The answer is it depends on
the number of rows and columns in the cross-tab. For the 2 × 2 cross-tab (fourfold table) in this
example, you added up the four values in Figure 12-5, so you may think that you
should look up the 8.81 chi-square value with 4 df. But you’d be wrong. Note the
italicized word independence in the preceding paragraph. And keep in mind that
the differences (Ob – Ex) in any row or column always add up to zero. The four
terms making up the 8.81 total aren’t independent of each other. It turns out that
the chi-square test statistic for a fourfold table has only 1 df, not 4. In general, an
N-by-M table, with N rows, M columns, and therefore N × M cells, has only
(N – 1) × (M – 1) df because of the constraints on the row and column sums. In our
case, N (the number of rows) is 2, so N – 1 is 1. Also, M (the number of
columns) is 2, so M – 1 is 1 also (and 1 times 1 is 1). Don’t feel bad if
this wrinkle caught you by surprise: even Karl Pearson, who invented the
chi-square test, got that part wrong!
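The df rule and the zero-sum constraint described above can be checked with a short Python sketch. It is only an illustration: the function names are made up for this example, and the counts in hypothetical_table are invented (they are not the Figure 12-5 data). The expected count for each cell is assumed to be the usual row total times column total divided by the grand total.

```python
def chi_square_df(n_rows, n_cols):
    """Degrees of freedom for a chi-square test on an n_rows-by-n_cols cross-tab."""
    return (n_rows - 1) * (n_cols - 1)


def observed_minus_expected(table):
    """Return the Ob - Ex difference for every cell of a cross-tab (a list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [
        [obs - row_totals[i] * col_totals[j] / grand_total
         for j, obs in enumerate(row)]
        for i, row in enumerate(table)
    ]


# Hypothetical counts, invented purely for illustration (NOT the Figure 12-5 data).
hypothetical_table = [[10, 20],
                      [30, 40]]

diffs = observed_minus_expected(hypothetical_table)
row_sums = [sum(row) for row in diffs]          # one sum per row
col_sums = [sum(col) for col in zip(*diffs)]    # one sum per column

print("df for a 2 x 2 table:", chi_square_df(2, 2))
print("Ob - Ex differences:", diffs)
print("row sums:", row_sums)
print("column sums:", col_sums)
```

Because every row and column of differences sums to zero, knowing one difference in a fourfold table fixes the other three, which is why only 1 df remains.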
So, if you were to manually look up the chi-square test statistic of 8.81 in a
chi-square table, you would have to look under the distribution for 1 df to find out
the p value. Alternatively, if you got this far and you wanted to use the statistical
software R to look up the p value, you would use the following code: pchisq(8.81, 1,
lower.tail = FALSE). Either way, the p value for chi-square = 8.81, with 1 df, is 0.003.
This means that if CBD and NSAIDs truly performed the same, there would be only a
0.003 probability that random fluctuations alone could produce a difference at least
as large as the one seen with respect to pain relief in chronic arthritis patients.
A 0.003 probability is the same
as 1 chance in 333 (because 1/0.003 ≈ 333), meaning very unlikely, but not impossible.
So, if you set α = 0.05, because 0.003 < 0.05, your conclusion would be that
in the chronic arthritis patients in our sample, whether the participant took CBD
or NSAIDs was statistically significantly associated with whether or not they felt
pain relief.
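If you work in Python rather than R, the same lookup can be sketched with only the standard library. This is a minimal sketch that relies on a shortcut valid only for 1 df: a chi-square variable with 1 df is the square of a standard normal variable, so its upper-tail probability equals erfc(sqrt(x / 2)). The function name chi_square_p_value_1df is hypothetical, and tables with more than 1 df would need a general chi-square routine from a statistics library instead.

```python
import math


def chi_square_p_value_1df(chi_square):
    """Upper-tail p value for a chi-square statistic with exactly 1 df.

    Uses the identity P(X > x) = erfc(sqrt(x / 2)), which holds only for 1 df.
    """
    return math.erfc(math.sqrt(chi_square / 2))


chi_square = 8.81   # test statistic from the text
alpha = 0.05        # significance level from the text

p_value = chi_square_p_value_1df(chi_square)

print(f"p value: {p_value:.3f}")
print(f"0.003 is about 1 chance in {1 / 0.003:.0f}")
print("statistically significant at alpha = 0.05:", p_value < alpha)
```

The result should agree, to three decimal places, with the p value reported for pchisq(8.81, 1, lower.tail = FALSE).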